Clusterer ensemble
Authors
Abstract
Ensemble methods that train multiple learners and then combine their predictions have been shown to be very effective in supervised learning. This paper explores ensemble methods for unsupervised learning. Here an ensemble comprises multiple clusterers, each of which is trained by the k-means algorithm with different initial points. The clusters discovered by different clusterers are aligned, i.e. similar clusters are assigned the same label, by counting their overlapping data items. Then, four methods are developed to combine the aligned clusterers. Experiments show that clustering performance can be significantly improved by ensemble methods, where using mutual information to select a subset of clusterers for weighted voting is a particularly good choice. Since the proposed methods work by analyzing the clustering results instead of the internal mechanisms of the component clusterers, they are applicable to diverse kinds of clustering algorithms.
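The pipeline sketched in the abstract can be illustrated in a few dozen lines. The Python snippet below is a minimal sketch, not the authors' code: the function names, the bipartite matching used for alignment, and the NMI-based selection heuristic are assumptions made for illustration. It runs k-means with different initial points, aligns the labelings by counting overlapping data items, weights each clusterer by its average mutual information with the others, and combines a selected subset by weighted voting.

```python
# Sketch of a clusterer ensemble: k-means with different initialisations,
# label alignment by overlap counting, and MI-weighted selective voting.
# Names and the selection heuristic are illustrative assumptions.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score


def align_labels(reference, labels, k):
    """Relabel `labels` so each cluster maps to the reference cluster it
    overlaps most, via a maximum-weight bipartite matching."""
    overlap = np.zeros((k, k), dtype=int)
    for lab, ref in zip(labels, reference):
        overlap[lab, ref] += 1
    # linear_sum_assignment minimises cost, so negate the overlap counts
    rows, cols = linear_sum_assignment(-overlap)
    mapping = dict(zip(rows, cols))
    return np.array([mapping[lab] for lab in labels])


def clusterer_ensemble(X, k, n_clusterers=10, n_selected=5, seed=0):
    # 1. Train component clusterers: k-means with different initial points.
    labelings = [
        KMeans(n_clusters=k, n_init=1, random_state=seed + i).fit_predict(X)
        for i in range(n_clusterers)
    ]
    # 2. Align every labeling to the first one by overlap counting.
    aligned = [align_labels(labelings[0], lab, k) for lab in labelings]
    # 3. Weight each clusterer by its average mutual information with the
    #    others, and keep only the top `n_selected` (selective voting).
    weights = np.array([
        np.mean([normalized_mutual_info_score(a, b)
                 for j, b in enumerate(aligned) if j != i])
        for i, a in enumerate(aligned)
    ])
    chosen = np.argsort(weights)[-n_selected:]
    # 4. Weighted voting: each point joins the cluster with the largest
    #    total weight among the selected clusterers.
    votes = np.zeros((X.shape[0], k))
    for i in chosen:
        votes[np.arange(X.shape[0]), aligned[i]] += weights[i]
    return votes.argmax(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in (0, 3, 6)])
    print(clusterer_ensemble(X, k=3)[:10])
```

Because the combination step only looks at the aligned label vectors, the same scheme would apply unchanged if the component clusterers were produced by a different clustering algorithm.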
Similar papers
A hierarchical clusterer ensemble method based on boosting theory
Bagging and boosting are two well-known, successful methods for developing classifier ensembles. It is recognized that clusterer ensemble methods that utilize the boosting concept can produce clusterings with improved quality and robustness. In this paper, we introduce a new boosting-based hierarchical clusterer ensemble method called Bob-Hic. This method is used to create a consensu...
A cascaded classification approach to disambiguating polysemous mentions with social chains
This paper considers five features, including titles, community chains, terms, temporal expressions, and hostnames, for personal name disambiguation. In nine test data sets covering three ambiguous personal names, we address the issues of the awareness degree of an entity, the source of materials, and web pages in different areas. In a single-clusterer approach, employing all features achieves an average ...
The analysis of the addition of stochasticity to a neural tree classifier
This paper describes various mechanisms for adding stochasticity to a dynamic hierarchical neural clusterer. Such a network grows a tree-structured neural classifier dynamically in response to the unlabelled data with which it is presented. Experiments are undertaken to evaluate the effects of this addition of stochasticity. These tests were carried out using two sets of internal parameters, th...
Gene sequence data sets analysed using a hierarchical neural clusterer
Evolutionary Algorithms have been used to optimise the performance of neural network models before. This paper uses a hybrid approach by permanently attaching a Genetic Algorithm (GA) to a hierarchical clusterer to investigate appropriate parameter values for producing specific tree shaped representations for some gene sequence data. It addresses a particular problem where the size of the data ...
A Preprocessing Technique to Investigate the Stability of Multi-Objective Heuristic Ensemble Classifiers
Background and Objectives: Given the random nature of heuristic algorithms, stability analysis of heuristic ensemble classifiers is of particular importance. Methods: The novelty of this paper is the use, for the first time, of a statistical method consisting of the Plackett-Burman design and Taguchi to identify not only the important parameters but also their optimal levels. Minitab and Design Expert ...
Journal: Knowl.-Based Syst.
Volume 19, Issue -
Pages: -
Publication year: 2006